Similarity Clustering of Proteins Using Substantive Knowledge and Reconstruction of Evolutionary Gene Histories in Herpesvirus

نویسندگان

Paul Kellam

Boris Mirkin

Renata Camargo

Trevor Fenner

George Loizou

چکیده

The issue of clustering proteins into homologous protein families (HPFs) has attracted considerable attention by researchers. On one side, many databases of protein families have been developed by using popular sequence alignment tools and relatively simple clustering methods followed by extensive manual curation. On the other side, more elaborate clustering approaches have been used, yet with a very limited degree of success. This paper advocates an approach to clustering protein families involving knowledge of the protein functions to adjust the parameter of similarity scale shift. One more source of external information is utilised as we proceed to reconstruct HPF evolutionary histories over an evolutionary tree: the consistency between these histories and information on gene arrangement in the genomes is used to narrow down the choice of the clustering. Response to Reviewers: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 Similarity Clustering of Proteins Using Substantive Knowledge and Reconstruction of Evolutionary Gene Histories in Herpesvirus ∗ Boris Mirkin, Renata Camargo, Trevor Fenner, George Loizou, Paul Kellam

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

متن کامل

Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering

Classification is an one of the important parts of data mining and knowledge discovery. In most cases, the data that is utilized to used to training the clusters is not well distributed. This inappropriate distribution occurs when one class has a large number of samples but while the number of other class samples is naturally inherently low. In general, the methods of solving this kind of prob...

متن کامل

Determining Difference in Evolutionary Variation of Bacterial RecA proteins vs 16SrRNA Genes by using 16s_Toxonomy Tree

Background and Aims: The rate of variation in various genes of a bacterial species is different during evolution. Therefore, in systematic bacterial studies many researchers compare the phylogenetic tree of a particular gene to the standard tree of an rRNA gene. Regarding the importance of 16SrRNA gene and the evolutional process of RecA protein family, we investigated the changes in the select...

متن کامل

خوشه‌بندی اسناد مبتنی بر آنتولوژی و رویکرد فازی

Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...

متن کامل

Multi-layer Clustering Topology Design in Densely Deployed Wireless Sensor Network using Evolutionary Algorithms

Due to the resource constraint and dynamic parameters, reducing energy consumption became the most important issues of wireless sensor networks topology design. All proposed hierarchy methods cluster a WSN in different cluster layers in one step of evolutionary algorithm usage with complicated parameters which may lead to reducing efficiency and performance. In fact, in WSNs topology, increasin...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2009

Similarity Clustering of Proteins Using Substantive Knowledge and Reconstruction of Evolutionary Gene Histories in Herpesvirus

نویسندگان

چکیده

منابع مشابه

A partition-based algorithm for clustering large-scale software systems

Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering

Determining Difference in Evolutionary Variation of Bacterial RecA proteins vs 16SrRNA Genes by using 16s_Toxonomy Tree

خوشه‌بندی اسناد مبتنی بر آنتولوژی و رویکرد فازی

Multi-layer Clustering Topology Design in Densely Deployed Wireless Sensor Network using Evolutionary Algorithms

عنوان ژورنال:

اشتراک گذاری